Speech translation has always been about giving source text or audio input and waiting for the system to return translated output in the desired form. In this paper, we present the Acoustic Dialect Decoder (ADD), a voice-to-voice ear-piece translation device. We introduce and survey recent advances in the field of speech engineering that can be employed in the ADD, focusing in particular on the three major processing steps of recognition, translation, and synthesis. We tackle the problem of machine understanding of natural language by designing a recognition unit that converts source audio to text, a translation unit that converts source-language text to target-language text, and a synthesis unit that converts target-language text to target-language speech. Speech from the surroundings will be recorded by the recognition unit on the ear-piece, and translation will begin as soon as one sentence has been successfully read. In this way, we aim to produce translated output while the input is still being read. The recognition unit will use the Hidden Markov Model (HMM) based Tool-Kit (HTK) together with hybrid RNN systems with gated memory cells, and the synthesis unit will use the HMM-based speech synthesis system HTS. The system will initially be built as an English-to-Tamil translation device.
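The sentence-by-sentence flow described above can be sketched as a minimal three-stage pipeline. The function names and the toy word-lookup translation below are illustrative assumptions, not the paper's implementation: the real units would wrap HTK/RNN recognition, a full translation model, and HTS synthesis.

```python
# Sketch of the ADD pipeline: recognition -> translation -> synthesis.
# All three units are stubs standing in for the real components.

def recognize(audio: bytes) -> str:
    """Recognition unit: source audio -> source-language text (stub
    standing in for HTK / hybrid-RNN decoding)."""
    return audio.decode("utf-8")

def translate(text: str, lexicon: dict) -> str:
    """Translation unit: source text -> target text (toy word lookup;
    a real unit would use a full translation model)."""
    return " ".join(lexicon.get(word, word) for word in text.split())

def synthesize(text: str) -> bytes:
    """Synthesis unit: target text -> target-language speech (stub
    standing in for HTS waveform generation)."""
    return text.encode("utf-8")

def add_pipeline(audio: bytes, lexicon: dict) -> bytes:
    """Run one recorded sentence through all three units, mirroring the
    'translate as soon as one sentence is read' design."""
    return synthesize(translate(recognize(audio), lexicon))

# Toy English-to-Tamil word list, for illustration only.
lexicon = {"hello": "வணக்கம்", "world": "உலகம்"}
print(add_pipeline(b"hello world", lexicon).decode("utf-8"))
```

Because each stage consumes exactly the previous stage's output, the units can be developed and swapped independently, which matches the modular recognition/translation/synthesis split the paper proposes.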